[57] ABSTRACT

An optimizing compiler process and apparatus is disclosed 
for more accurately and efficiently identifying live variable 
sets in a portion of a target computer program, so as to more 
efficiently allocate registers in a computer central processing 
unit. The process of the invention includes the steps of 
performing a static single assignment transform to a com- 
puter program, including the addition of phi functions to a 
control flow graph. Basic blocks representing a use of a 
variable are further added to the control flow graph between 
the phi functions and definitions of the variables converging 
at the phi functions. A backward dataflow analysis is then 
performed to identify the live variable sets. The variables in 
the argument of phi functions are not included as a use of 
those variables in this dataflow analysis. The dataflow 
analysis may be iteratively performed until the live variable 
sets remain constant between iterations. The apparatus of the 
present invention includes a computer system having a 
memory storing a set of variables that are live during the 
execution of a portion of a computer program. This memory 
may include CPU registers, cache memory, conventional 
RAM and other types of memory. 

22 Claims, 9 Drawing Sheets 
METHOD AND APPARATUS FOR AN IMPROVED OPTIMIZING COMPILER

BACKGROUND OF THE INVENTION

1. Field of the Invention

This invention relates to the field of Optimizing Compilers for computer systems. More specifically, this invention is an improved method and apparatus for allocating live variables to available Central Processing Unit ("CPU") registers during the code optimization pass of an optimizing compiler.

2. Background

It is desirable that computer programs be as efficient as possible in their execution time and memory usage. This need has spawned the development of optimizing compilers. Optimizing compilers typically contain a Code Optimization section which sits between a compiler front end and a compiler back end. The Code Optimization section takes as input the "intermediate code" output by the compiler front end, and operates on this code to perform various transformations to it which will result in a faster and more efficient target program. The transformed code is passed to the compiler back end which then converts the code to a binary version for the particular machine involved (i.e. SPARC, X86, IBM, etc). The Code Optimization section itself needs to be as fast and memory efficient as it possibly can be.

For example, most code optimization sections attempt to optimize the usage of the available CPU registers in the target computer. This is done by determining a "live variable set" for the target program using graphing and flow control techniques and rearranging the program code to minimize the number of variables which must be "spilled" into memory from the registers during program execution. This [illegible] "register allocation" [illegible] variables in the average target program being compiled and [illegible] program variables, the code optimizer itself could be excessively time consuming and use memory excessively in making these register allocation calculations if it does not handle the optimizing calculations properly.

In the past, attempts have been made to develop optimizing compilers generally, and code optimizer modules specifically, which themselves run as efficiently as possible. A general discussion of optimizing compilers and the related techniques used can be found in the text book "Compilers: Principles, Techniques and Tools" by Alfred V. Aho, Ravi Sethi and Jeffrey D. Ullman, Addison-Wesley Publishing Co 1988, ISBN 0-201-10088-6, especially chapters 9 & 10 pages 513-723. One such attempt at reducing the calculation time of the code optimizer was to reduce the time for calculating "live variable sets" in the register allocation process. Normally, the calculation of the live variables in a control flow graph involves N² calculations and memory storage space, where N = the number of variables. In this attempt at reducing calculation time, an approach to finding the "live variable set" involved the construction of the target's static single assignment ("SSA") graph and the insertion in the graph of special definitions called "Φ functions" or "Φ nodes" [illegible] join and different values merge. This method reduced the calculations for the number of variables from N³ to N, a significant reduction in time and memory usage. This method is described in "Register Allocation via Graph Coloring" by Preston Briggs, PhD. thesis, Rice University, April, 1992 which is incorporated herein by reference.

However, this method of Briggs produces more variable "spilling" than necessary by making it appear that some variables in the "live set" cannot co-exist in the registers with other variables (thus causing them to spill into memory) when in fact they can.

The present invention uses an elegant method to reduce [illegible] variable [illegible] by increasing the target [illegible].

SUMMARY OF THE INVENTION

The present invention overcomes the disadvantages of the above described systems by providing an economical, high performance, adaptable system and method for minimizing variable spill in the allocation of machine registers to a target program's live variables [illegible] optimizing compiler. In the preferred embodiment, dummy blocks (use-blocks) are inserted in the flow graph immediately following a [illegible] node [illegible] determining process is modified to consider these "use-blocks" and not to consider the "Φ functions". This revised process has the effect of minimizing the aforementioned erroneous variable spill.

In one aspect of the present invention a code optimizer for use in a compiler system is provided wherein the code optimizer has portions configured to determine the set of live variables for basic blocks of a target program, and to allocate the CPU registers of a target computer architecture to the set of live variables in a way which minimizes variable spill, wherein said code optimizer inserts, in at least a portion of a control flow graph, dummy blocks representative of a use of a variable in said control flow graph prior to a phi-function node and determines the set of live variables by considering said dummy blocks.

In another aspect of the invention a compiler system is provided wherein a code optimizer portion which generates a [illegible] the target program [illegible] target program's live variable set according to the above process so as to map this live variable set to the designated CPU registers of a target computer architecture, wherein said code optimizer inserts, in at least a portion of a control flow graph, dummy blocks representative of a use of a variable in said control flow graph prior to a phi-function node and determines the set of live variables by considering said dummy blocks.

In yet another aspect of the invention, a computer system for use in compiling a target program to run on a target computer architecture having a fixed number of CPU registers is provided wherein the computer system includes a compiler system including a code optimizer configured to determine the target program's live variable set according to the above process and to map this live variable set to the designated CPU registers, wherein said code optimizer inserts, in at least a portion of a control flow graph, dummy blocks representative of a use of a variable in said control flow graph prior to a phi-function node and determines the set of live variables by considering said dummy blocks.

[illegible] of live variables in a [illegible] to a designated set of CPU registers in order to optimize the running of a target program, wherein, in at least a portion of a control flow graph, a dummy block representative of a use of a variable in said control flow graph is inserted prior to a phi-function node and wherein the set of live variables is determined by considering said dummy block.

DESCRIPTION OF THE DRAWINGS

The objects, features and advantages of the system of the present invention will be apparent from the following description in which:
FIG. 1 illustrates a portion of a computer, including a CPU and conventional memory in which the present invention may be embodied.

FIG. 2 illustrates a typical compiler showing the position of the code optimizer.

FIG. 3 illustrates a large scale organization of a code optimizer.

FIG. 4 illustrates an organization of the Register Allocation portion of FIG. 3.

FIG. 5 illustrates a representation of steps in constructing the Interference Graph.

FIG. 6 illustrates a process for finding variables in each block's live set.

FIG. 7 is an illustrative example of a control flow graph for a portion of a computer program.

FIG. 8 is a control flow graph for the portion of a computer program illustrated in FIG. 7 after application of a static single assignment transform.

FIG. 9 is a control flow graph for the program portion illustrated in FIG. 7 after execution of a transform in accordance with the process of the present invention.

FIGS. 10A-C is a chart illustrating sets of live variables for various basic blocks of the program portion represented by the control flow graph of FIG. 9.

NOTATIONS AND NOMENCLATURE

The detailed descriptions which follow are presented largely in terms of procedures and symbolic representations of operations on data bits within a computer memory. These procedural descriptions and representations are the means used by those skilled in the data processing arts to most effectively convey the substance of their work to others skilled in the art.

A procedure is here, and generally, conceived to be a self-consistent sequence of steps leading to a desired result. These steps are those requiring physical manipulations of physical quantities. Usually, though not necessarily, these quantities take the form of electrical or magnetic signals capable of being stored, transferred, combined, compared, and otherwise manipulated. It proves convenient at times, principally for reasons of common usage, to refer to these signals as bits, values, elements, symbols, characters, terms, numbers, or the like. It should be borne in mind, however, that all of these and similar terms are to be associated with the appropriate physical quantities and are merely convenient labels applied to these quantities.

Further, the manipulations performed are often referred to in terms, such as adding or comparing, which are commonly associated with mental operations performed by a human operator. No such capability of a human operator is necessary, or desirable in most cases, in any of the operations described herein which form part of the present invention; the operations are machine operations. Useful machines for performing the operations of the present invention include general purpose digital computers or similar devices. In all cases there should be borne in mind the distinction between the method operations in operating a computer and the method of computation itself. The present invention relates to method steps for operating a computer in processing electrical or other (e.g., mechanical, chemical) physical signals to generate other desired physical signals.

The present invention also relates to apparatus for performing these operations. This apparatus may be specially constructed for the required purposes or it may comprise a general purpose computer as selectively activated or reconfigured by a computer program stored in the computer. The procedures presented herein are not inherently related to a particular computer or other apparatus. In particular, various general purpose machines may be used with programs written in accordance with the teachings herein, or it may prove more convenient to construct more specialized apparatus to perform the required method steps. The required structure for a variety of these machines will appear from the description given.

DESCRIPTION OF THE PREFERRED EMBODIMENT

Apparatus and methods are disclosed for allocating live variables to available Central Processing Unit ("CPU") registers during the code optimization pass of an optimizing compiler wherein register "spills" of variables are minimized. In the following description, for purposes of explanation, specific instruction calls, modules, etc., are set forth in order to provide a thorough understanding of the present invention. However, it will be apparent to one skilled in the art that the present invention may be practiced without the specific details. In other instances, well known circuits and devices are shown in block diagram form in order not to obscure the present invention unnecessarily. Similarly, in the preferred embodiment, use is made of uniprocessor and multi-processor computer systems as well as the Solaris operating system, all of which are made and sold by Sun Microsystems, Inc. However the present invention may be practiced on other [illegible] systems and using other compatible operating systems.

Operating Environment

The environment in which the present invention is used encompasses the general distributed computing system, wherein general purpose computers, workstations, or personal computers are connected via communication links of various types, in a client-server arrangement, wherein programs and data, many in the form of objects, are made available by various members of the system for execution and access by other members of the system. Some of the elements of a general purpose workstation computer are shown in FIG. 1, wherein a processor 1 is shown, having an Input/output ("I/O") section 2, a central processing unit ("CPU") 3 and a memory section 4. The I/O section 2 is connected to a keyboard 5, a display unit 6, a disk storage unit 9 and a CD-ROM drive unit 7. The CD-ROM unit 7 can read a CD-ROM medium 8 which typically contains programs 10 and data. FIG. 2 illustrates a typical optimizing compiler 20, comprising a front end compiler 24, a code optimizer 26 and a back end code generator 28. The front end 24 of a compiler takes as input a program written in a source language 22 and performs various lexical, syntactical and semantic analysis on this language, outputting an intermediate set of code 32 representing the target program. This intermediate code 32 is used as input to the code optimizer 26 module which attempts to improve the intermediate code so that faster-running machine code will result. Some code optimizers 26 are trivial and others do a variety of computations in an attempt to produce the most efficient target program possible. This latter type are called "optimizing compilers" and include such code transformations as common sub-expression elimination, dead-code elimination, renaming of temporary variables and interchange of two independent adjacent statements as well as register allocation.

FIG. 3 depicts a typical organization of an optimizing compiler 40. On entry of the intermediate code 42 a Control
Flow Graph is constructed 44. At this stage the aforementioned code transformations (common sub-expression elimination, dead-code elimination, renaming of temporary variables and interchange of two independent adjacent statements, etc.) take place 46. Instruction scheduling or "pipelining" may take place 48 at this point. Then "register allocation" is performed 50 and the modified code is written out 58 for the compiler back end to convert it to the binary language of the target machine (i.e. SPARC, X86, etc). It is this "Register Allocation" process which is the focus of applicants' invention.

Register Allocation

The central processor unit (CPU) of most conventional computers typically includes a set of registers in which various data are stored for rapid access. During execution of a computer program the values of variables present in the program are normally stored in these registers. Modern CPUs normally possess between 32 and 64 floating point registers. While sufficient for many business related application programs, programs related to scientific applications usually employ far more variables than the number of available registers. When too many variables are present in a program, the values of the excess variables are normally stored in cache memory or, alternatively, stored in conventional random access memory. Access to the values of variables stored in cache or in conventional memory, however, takes substantially more time than accessing values stored in the registers of the CPU. Typically CPU access to variables that are stored in cache memory is two to three times slower than the time required to access the CPU registers. Access to conventional memory is normally thirty to fifty times slower than CPU register access time. Thus programs tend to execute far more slowly when the values of variables are stored in either cache or conventional memory rather than CPU registers. The storage of variables in cache memory or conventional memory is usually referred to as spill-over.

A substantial amount of engineering effort has been devoted to avoiding or minimizing the effects of spill-over. Often, for example, variables used at the beginning of a program are no longer in use at the end of that program. Thus a CPU register allocated to the storage of values for a variable that is in use only at the beginning of a program may be more efficiently utilized by allocation to a different variable during the execution of the remainder of the program, when the first variable is no longer in use. Variables that are in use during the execution of a portion of a program are often referred to as live while that portion of the program is being executed. By determining the set of variables that are live during the execution of a portion of a program, that is the "live variable set" for that program portion, a more efficient allocation of CPU registers can be made. As each portion of a program is executed, only those variables that are members of the live set for that program portion need be stored in the CPU registers.

Typically advanced analytical processes, such as graph coloring, are used in the prioritization of program variables and the allocation of registers. Examples of these techniques are presented in a paper entitled "Register Allocation And Spilling Via Graph Coloring" in SIGPLAN Notices, 17(6): pages 96-105, June 1982, from the proceedings of the ACM SIGPLAN 1982 Symposium on Compiler Construction. A further survey of graph coloring techniques used in the prioritization of live variables and allocation of CPU registers is disclosed in the aforementioned PhD thesis paper entitled "Register Allocation Via Graph Coloring" by Preston Briggs which is incorporated herein by reference.

While a variety of program variable prioritization processes and register allocation techniques exist, and are in current use today, all of these techniques rely on accurately identifying the live variable set for each of the various portions of a computer program. A variety of different techniques are currently available for determining live variable sets. Unfortunately, however, these conventional techniques are either calculation intensive, inaccurate, or both. An example of one technique for determining live variable sets involves the analysis of definition-use chains. Another technique for determining live variable sets involves the analysis of static single assignment ("SSA") transforms of a program. The static single assignment transform technique is usually regarded as having several advantages over the analysis of definition-use chains. Typically the SSA transform tends to grow in size at a slower rate than the definition-use chain as the size of the computer program under analysis increases. Optimizing compilers employing SSA transform based processes therefore tend to run faster and consume less memory than those relying on definition-use chain related processes. Several other functions commonly performed by an optimizing compiler, such as value numbering, dead code elimination and constant propagation, can also be performed more efficiently when SSA transforms are employed instead of processes based on definition-use chains. Unfortunately, however, SSA transforms invariably provide inaccurate results when live variable sets are determined by performing a backwards dataflow analysis on the SSA transform. The present invention provides a more accurate and efficient method of determining live variable sets that is based on SSA transform related methodologies.

Referring now to FIG. 4 an organization of a typical Register Allocator 70 is shown. This corresponds to block 50 in FIG. 3. On entry to the register allocation routine 72 an "Interference Graph" is constructed using "Control Flow Graph" techniques 74. The "Interference Graph" construction is described in detail below. A "flow graph" is a directed graph the nodes of which are the "basic blocks" of the target program. A "basic block" is a sequence of consecutive statements in which flow of control enters at the beginning of the block and leaves at the end without halt or the possibility of branching except at the end. Continuing in FIG. 4, after the Interference Graph is constructed 74, an order is chosen for coloring the Interference Graph 76. Then an attempt is made to map the "virtual registers" identified in the Interference Graph into the "physical register" set 78. During this mapping process variables which cannot be mapped into registers are flagged to be "spilled" to memory. "Success" in mapping the identified set of virtual registers is defined as any graph wherein any two adjacent nodes have different colors (i.e. actual registers) for all possible pairs of nodes. So in FIG. 4 the test for "success" 80 is made and if there is success 82 the register allocator routine is exited 84 as all registers have been allocated for the target process with no conflict of variables. If "success" fails 88 the variables which were previously marked to be "spilled" are assigned to memory 86 and the register allocation routine is begun again 74 without the spilled variables. This iterative procedure continues until "success" as defined above is achieved.

Referring now to FIG. 5 the procedure for constructing the Interference Graph 90 is described. On entry 92 to this routine a first basic block is identified 94. Starting at the end of the basic block, the virtual registers which are currently in the live variable set at the end of the block are determined 96. For each instruction from the end of the basic block,
working backward to the start of the block, an edge is created between each virtual register defined by the instruction and each member of the current live variable set 98. Each virtual register "defined" by the instruction is removed from the live variable set 100, and each virtual register "used" by the instruction is "added" to the live variable set 102. This process is performed for each instruction in the block and when there are no more instructions in the block 106 the next block is identified if there are more blocks to complete 110. If all blocks have been so processed 112 the interference graph for this target procedure is completed. At this stage the Interference Graph is used to attempt to map the virtual registers in the target procedure into the CPU's physical registers as described above with reference to FIG. 4. In the preferred embodiment of the present invention an improvement is identified in the process of finding the set of live variables at the end of each basic block (block 96 in FIG. 5). This improvement is designed to correct failures in register assignment/mapping which occur when the graph form of the program is in SSA form. Such failures result in certain variables being spilled to memory (thus slowing the target program run time) when there is no need to do so. These improvements are described in the context of the process for finding the variables in each block's live set 120 illustrated in FIG. 6.
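
The backward scan of FIG. 5 can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than the patented implementation: the encoding of an instruction as a (defs, uses) pair and the names build_interference_graph, blocks and live_out are hypothetical, and the live set at the end of each block is taken as already known.

```python
from collections import defaultdict


def build_interference_graph(blocks, live_out):
    """Build an undirected interference graph.

    blocks:   dict mapping a block name to a list of (defs, uses) pairs,
              one pair per instruction, in program order (assumed encoding).
    live_out: dict mapping a block name to the set of virtual registers
              live at the end of that block (assumed to be precomputed).
    Returns a dict mapping each register to the set of its neighbours.
    """
    graph = defaultdict(set)
    for name, instructions in blocks.items():
        live = set(live_out[name])          # live set at the end of the block
        for defs, uses in reversed(instructions):
            for d in defs:
                graph[d]                    # make sure the node exists
                for v in live:
                    if v != d:              # no self-edges
                        graph[d].add(v)
                        graph[v].add(d)
            live -= set(defs)               # defined registers leave the set
            live |= set(uses)               # used registers enter the set
    return graph


example_blocks = {"B0": [(["a"], []), (["b"], []), (["c"], ["a", "b"])]}
example_graph = build_interference_graph(example_blocks, {"B0": {"c"}})
```

In this small example, a and b are both live just before the third instruction, so they receive an edge; c is defined only after both have died and therefore interferes with neither.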

Referring now to FIG. 6, a detailed illustration of the process of block 96 of FIG. 5 is shown. On entry to this process 122, dummy blocks with uses of the phi-function (Φ-function) arguments are added to the control flow graph 124. These dummy blocks are added to an edge connected to a Φ-function, prior to the Φ-function block. These Φ-functions are described in more detail below. Next the live sets at the beginning of each block are initialized to the empty set 126. The global change flag which is the general control for this procedure is initialized to FALSE 130. Then starting at the end of the target program and working backwards over each basic block in the target program the process finds variables that are live at the end of the block. This process begins by getting the last block in the procedure 131. In order to determine the current approximation of the set of variables live at the end of a target block, the process unions the sets of variables live at the start of each block that is a successor of the target block in the control flow graph 132. Then working from the last instruction in the basic block 133 to the first instruction in the block, if the instruction is a "use" in a dummy block, the "used" variable (virtual register) is added to the live set 134. If the instruction is a "Φ-function" then the virtual register defined by the Φ-function is subtracted from the live set 136 and the arguments of the Φ-function are not added to the live set. For all other instructions, subtract a virtual register which is "defined" and add a virtual register to the live set if it is "used" 138. If the latest instruction just processed is the first instruction in the block 144 this means that the block is completed. The live set is checked to see if it changed from the start of the processing of this block 146 and if so 148 the global change flag is set to TRUE 150 and control goes to block 154. If the live set did not change 152 a determination is made as to whether there are any more basic blocks to process 154 and if so 156 the process gets another block 160 and the process for this block is repeated beginning at block 132. This process loop on the current block continues until the live set does not change 152 at which time the determination of the set of live variables for this block is done. If there are more blocks to do 156 another block's pointers are obtained 160 and the process is repeated beginning at block 132. If all blocks are completed 158 the change flag is tested 162 and if it is TRUE 168 the process is repeated again from block 130. If the change flag is FALSE 164 the live set determination is complete 166. This process and the process of how the variables change, and how the dummy and Φ-function blocks are used, are now explained in some detail.
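
The iteration of FIG. 6 can be sketched in Python as follows. This is a simplified illustration, not the patented implementation: the instruction encodings ("use", v), ("phi", target, args) and ("op", defs, uses), the block dictionary layout, and the names live_sets, blocks and order are assumptions made for this example. The sketch follows the rule stated above: a dummy-block use adds its variable, a Φ-function removes its target without adding its arguments, and every other instruction removes its definitions and adds its uses.

```python
def live_sets(blocks, order):
    """Compute live-in and live-out sets by iterating to a fixed point.

    blocks: dict name -> {"instrs": [...], "succs": [names]} (assumed layout).
    order:  block names, visited from the last block backwards.
    """
    live_in = {name: set() for name in blocks}    # initialised to empty
    live_out = {name: set() for name in blocks}
    changed = True                                # global change flag
    while changed:
        changed = False
        for name in order:
            block = blocks[name]
            live = set()
            for succ in block["succs"]:           # union of successors' live-in
                live |= live_in[succ]
            live_out[name] = set(live)
            for instr in reversed(block["instrs"]):
                kind = instr[0]
                if kind == "use":                 # use inside a dummy block
                    live.add(instr[1])
                elif kind == "phi":               # arguments are NOT added
                    live.discard(instr[1])
                else:                             # ordinary instruction
                    _, defs, uses = instr
                    live -= set(defs)
                    live |= set(uses)
            if live != live_in[name]:
                live_in[name] = live
                changed = True
    return live_in, live_out


# Hypothetical diamond: x2 and x3 merge into x4 through a phi function,
# with dummy use blocks U2 and U3 placed on the edges into the phi block.
cfg = {
    "B0": {"instrs": [("op", ["x1"], [])], "succs": ["B1", "B2"]},
    "B1": {"instrs": [("op", ["x2"], ["x1"])], "succs": ["U2"]},
    "B2": {"instrs": [("op", ["x3"], [])], "succs": ["U3"]},
    "U2": {"instrs": [("use", "x2")], "succs": ["B3"]},
    "U3": {"instrs": [("use", "x3")], "succs": ["B3"]},
    "B3": {"instrs": [("phi", "x4", ["x2", "x3"]), ("op", [], ["x4"])],
           "succs": []},
}
cfg_live_in, cfg_live_out = live_sets(cfg, ["B3", "U3", "U2", "B2", "B1", "B0"])
```

In this hypothetical graph x2 is live only along the B1 path and x3 only along the B2 path, so the two never appear in the same live set and would not be forced into different registers.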
Referring to FIG. 7 there is shown a control flow graph 210 for an illustrative portion of a computer program. In accordance with one aspect of the invention, a set of live variables is determined for this, and for every other portion of the program. Once these live variable sets have been determined, a more efficient allocation of CPU registers can then be performed. For example, the values of variables that are not in the live variable set of the program portion illustrated in FIG. 7 need not be stored in CPU registers during execution of this portion of the program. In order to determine the live variable sets, the program portion illustrated in the control flow graph 210 of FIG. 7 is first divided into sets of basic blocks. As defined above, a basic block is a portion of code in a computer program having the property that each line of code in the basic block will necessarily be executed if the first line of code in that block is executed. Basic blocks are essentially self contained portions of a program. They do not contain branch instructions to other portions of the program except, perhaps, in a last line in the block.
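
The text does not say how the partition into basic blocks is obtained. One common textbook approach is the "leader" technique, sketched below in Python as an assumption rather than as the method of the invention. The encoding of an instruction as a (text, targets) pair and the name split_basic_blocks are hypothetical.

```python
def split_basic_blocks(instrs):
    """Partition a linear instruction list into basic blocks.

    instrs: list of (text, targets) pairs, where targets is a list of
            instruction indices the instruction may branch to, or None
            for an instruction that never branches (assumed encoding).
    A new block starts at the first instruction, at every branch target,
    and at every instruction that immediately follows a branch.
    """
    leaders = {0} if instrs else set()
    for i, (_, targets) in enumerate(instrs):
        if targets is not None:
            leaders.update(targets)         # branch targets start a block
            if i + 1 < len(instrs):
                leaders.add(i + 1)          # so does the fall-through point
    starts = sorted(leaders)
    ends = starts[1:] + [len(instrs)]
    return [instrs[s:e] for s, e in zip(starts, ends)]


straight_line = [("a = 1", None), ("b = a + 1", None), ("c = a + b", None)]
single_block = split_basic_blocks(straight_line)
```

Straight-line code with no branches stays in a single block, while each branch and each branch target forces a block boundary, so a branch can only ever appear as the last instruction of a block.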

The program portion illustrated by control flow graph 210 may be considered to have been already partitioned into basic blocks such as blocks 212, 214, 216, 218, 219, 222 and 224. For purposes of explanation, several of the basic blocks 212-224 shown in FIG. 7 include various illustrative definitions and uses of a variable such as X. Thus basic block 212 includes an initial definition X=Y while basic block 219 includes both a definition and a use of a variable with the expression X=X+1. Other basic blocks are seen to include illustrative branching type instructions of the type commonly employed in a variety of computer programming languages. Basic block 214, for example, can be seen to form a loop including blocks 216-222. The conditional "while" statement in basic block 214 and the conditional "if" statement in basic block 216 both provide control flow branchings dependent upon their respective conditional arguments. It will be appreciated by those skilled in the art that the processes and apparatus disclosed herein could also be applied to software programs having basic blocks including the definitions and uses of several more variables. The single variable presented in the exemplary program portion shown in FIG. 7 is for illustrative purposes only, and should not be viewed as any form of a limitation on the invention.

In accordance with one aspect of the invention, a further step in the process of determining live variable sets for the basic blocks 212-224 shown in FIG. 7 involves the application of a static single assignment ("SSA") transform to control flow graph 210. As part of this transform each statement concerning a definition of a variable is renamed to refer to a new variable. In FIG. 8 there is shown a control flow graph 220 for the same program portion represented by the control flow graph 210 illustrated in FIG. 7. In the control flow graph 220 of FIG. 8 each definitional statement that includes the variable X has been renamed. For illustrative purposes the variable X terminology has been retained, but a numbered subscript has been added to reference the new variable name imparted by the transform. Thus, for example, the definition X=Y in basic block 212 is renamed X1=Y. Similarly the statement X=X+1 in basic block 219 is renamed X4=X2+1.

Along with the renaming of variables, special phi function statements are also added to the program transform. These phi functions are included in the control flow graph 220
wherever two or more differing definitions of the same variable converge in the control flow graph. The phi function is a new "definitional" instance of the variable, with the argument of the phi function including the differing definitions of the variable leading to the phi function. Basic block 214, for example, represents such a convergence in the definitions of X. That is, at basic block 214 the definitions of X in the statement "while (X<10)" can be derived from either the definition of X provided by basic block 212 or alternatively derived from the definition provided by basic block 222. A phi function is therefore inserted in the control flow graph 220 as a basic block 226 located between basic blocks 212 and 214. An additional indexed variable X2 is provided to represent the new definitional instance of the variable X that has been afforded by the addition of the phi function.

The argument of a phi function is derived from the differing values of the variable that lead to the phi function. Thus one of the arguments for the phi function at basic block 226 is X1, being derived from the definition of X1 at basic block 212. The other portion of the argument of the phi function at basic block 226, however, is derived from basic block 222. As can be seen in FIG. 8, basic block 222 also

[...]

is thus added to the control flow graph 230 between the phi function of block 228 having X4 in its argument and the definition of the variable X4 (i.e., X4=X2+1) at block 219. Similarly block 234 having a use f(X3) of the variable X3 is added between the phi function of block 228 (further having X3 in its argument) and the definition of the variable X3 (i.e., X3=10) at block 218.

Since another phi function is present in the control flow graph 220 at block 226, additional use blocks 236 and 238 are added to the control flow graph 230 between the phi function of block 226 and the definitions of the variables in the argument of that phi function. Block 236 with a use f(X1) of the variable X1 is inserted between the phi function Φ(X1, X5) at block 226 and the definition of the variable X1 at block 212. Similarly block 238 with a use f(X5) of the variable X5 is inserted between the phi function of block 226 and the definition of the variable X5 at block 228.

With the addition of the use blocks 233-238 a backward dataflow analysis may be performed on the control flow graph 230 to determine the live variable sets for each of the basic blocks in the graph. This analysis is preferably performed as an iterative process. In each iteration the process is begun at the end of the basic block. In performing this

represents a convergence of differing definitions of the analysis a variable that is used in a basic block is added to 

variable X. That is, the definition of X at basic block 222 25 the live variable set for that block. If the variable is subse- 

may be derived from either of basic block 218 with the quently defined in the same basic block, however, the 

definition iI X 3 =10" or from basic block 219 with the defi- variable is then removed from the live set In accordance 

nition **X 4 =X 2 +1". As further shown in FIG. 8, a phi wimstiU a further aspect of the invention, a use of a variable 

function is therefore also inserted at basic block 228, imme- in the argument of a phi function does not cause that variable 

diately preceding basic block 222, and provided with a new 30 to be added to the live set 

indexed variable lt X 5 ". This new indexed variable Xj in in FIG, 10A there is shown the set of live variables for 

basic block 228 provides a second and final portion of the each of the basic blocks 212-238 in the control flow graph 

argument of the phi function in basic block 226. The phi of FIG. 9 after the first iteration of the backward dataflow 

function in basic block 226 can now be completely analysis has been performed. Considering for example basic 

expressed as "Xa=0(X„ X5)". The arguments for the phi 35 block 212, the variable Y is added to the live set for block 

function at basic block 228 are, in turn, provided from the 212 since Y is used in the expression X^Y. The variable X A 

differing definitions of X converging at basic block 228, that is not added to the live set for basic block 212, however, 

is, from basic blocks 218 and 219. Thus, the definition since Xj is merely defined in block 212. Similarly the live 

X3=10 in basic block 218 and the definition X 4 =X 2 +1 in variable set of basic block 219 includes X 2 since this 

basic blocks 219 provide the arguments for the phi function 40 variable is used in block 219, but does not include X4 since 

at basic block 228, which can thus be fully expressed as this variable is only defined in block 219. Basic blocks 216, 

X5=<B(X 3 , X 4 ). 218 and 222 can be seen to have empty live variable sets 

With the addition of the phi functions to the control flow since each of these blocks do not contain any uses of a 

graph 220 static single assignment transform of the program variable. The only uses of variables in basic blocks 226 and 

portion represented by the control flow graph 210 in FIG. 7 45 228 are within the arguments of phi functions and are thus, 

is complete. Use of this transform to determine live variable in accordance with the invention, ignored. Consequently the 

sets, however, has still been found to provide inaccurate live variable sets of basic blocks 226 and 228 are also empty, 

results, with an excessive number of variables in the live set The basic blocks 233-238 were expressly added to the 

that are not, in fact, live. In accordance with a further aspect control flow graph 230 to represent uses of various variables, 

of the invention, the static single assignment transform of a 50 as discussed above. The live set for each of these basic 

computer program is further modified with the addition of blocks (as shown in FIG. 10A) reflects their express use of 

basic blocks in the control flow graph representing **uses n of the relevant variables. The rernaining two basic blocks in 

the indexed variables in the arguments of the phi functions. control flow graph 230, blocks 214 and 224, each contain a 

Each such * W block includes a use of just one of the use of the variable Xj, as further shown in FIG. 10A. 

variables found in the argument of the phi function. These 55 In accordance with this invention the process of the 

use blocks are added to the control flow graph between the backward dataflow analysis is preferably iteratively per- 

phi functions and the differing definitions of the variable formed. In a second iteration of the backward dataflow 

converging in the control flow graph at the phi function. analysis of the invention, the live variable sets of the basic 

Referring to FIG. 9 there is shown a control flow graph blocks immediately subsequent to the basic block under 

230 including use blocks 233, 234, 236 and 238 that were 60 consideration are also considered in the live variable set 

added to the control flow graph 220 of FIG. 8 in accordance analysis. These subsequent blocks are normally termed the 

with the invention. The phi function of block 228, for child blocks, and the basic block immediately prior to the 

example, has the variables X 3 and X 4 in its argument child block is normally termed the parent block. Thus, the 

Accordingly, use blocks 233 and 234 are added to the union of the live variable sets for the children blocks are also 

control flow graph 230 between the phi function of block 65 included in the determination of the live variable set for the 

228 and the definitions of X 3 and X4 at blocks 218 and 219, parent block in the second iteration of the process. This 

respectively. Block 233, having a use f(X^) of the variable analysis is further preferably initiated at the last basic block, 
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block 224, and proceeds in reverse fashion through the 
control flow graph, ending with the first basic block, block 
212, of the control flow graph 230 shown in FIG. 9. 
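
The insertion of use blocks and the first iteration described
above can be sketched as follows. This is a minimal illustration
in Python, not the implementation of the invention: the
dictionary encoding of control flow graph 220, the function name
insert_use_blocks, and the association of each phi argument with
the incoming edge on which it arrives are assumptions made for
the example. Block labels and subscripts follow FIGS. 8 and 9 as
described in the text.

```python
# Hypothetical encoding of control flow graph 220 (FIG. 8): block -> children.
succ = {
    212: [226], 226: [214], 214: [216, 224], 216: [218, 219],
    218: [228], 219: [228], 228: [222], 222: [226], 224: [],
}
# Ordinary uses per block. Phi arguments are deliberately left out,
# because a phi argument is not treated as a use.
uses = {212: {"Y"}, 214: {"X2"}, 219: {"X2"}, 224: {"X2"}}
# Phi arguments, keyed by the parent block whose edge carries them.
phi_args = {226: {212: "X1", 222: "X5"}, 228: {218: "X3", 219: "X4"}}
# Labels for the new use blocks, chosen to match FIG. 9.
new_labels = {(219, 228): 233, (218, 228): 234,
              (212, 226): 236, (222, 226): 238}

def insert_use_blocks(succ, uses, phi_args, new_labels):
    """Split each edge entering a phi block with a single-use block."""
    succ = {b: list(children) for b, children in succ.items()}
    uses = {b: set(u) for b, u in uses.items()}
    for (parent, phi_block), label in new_labels.items():
        variable = phi_args[phi_block][parent]
        # Redirect parent -> phi_block through the new use block.
        succ[parent] = [label if c == phi_block else c for c in succ[parent]]
        succ[label] = [phi_block]
        uses[label] = {variable}  # the f(X) use of exactly one phi argument
    return succ, uses

succ_230, uses_230 = insert_use_blocks(succ, uses, phi_args, new_labels)

# First iteration: each live set holds only the block's own uses.
first_iteration = {b: set(uses_230.get(b, set())) for b in succ_230}
for block in sorted(first_iteration):
    print(block, sorted(first_iteration[block]))
```

Because the phi arguments never enter the uses table, blocks 226
and 228 start with empty live sets, while each use block carries
exactly the one variable it was added to represent.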

In FIG. 10B there is shown a live variable set for each of
the basic blocks 212-238 in control flow graph 230 after the
second iteration of the backward dataflow process of the
invention. Beginning with basic block 224 at the end of
control flow graph 230, the live set for this block remains
unchanged (still containing only a use of X2) since block 224
has no children blocks. The live variable set of basic block
214 also remains unchanged. The variable X2 was already in
the live variable set for block 214 after the first iteration of
the process because of the use of this variable in the
argument of the "while" statement (X2<10). Since X2 is
already in the live variable set for block 214 no change is
made by consideration of this variable in the live set of block
224, the child to block 214.

By comparison of FIGS. 10A and 10B it can further be
seen that the live variable set of basic block 226 also remains
unchanged, although for a markedly different reason. Basic
block 214 is the child of block 226. Thus in the second
iteration, the variable X2 in the live set of basic block 214
is considered in the determination of the live variable set for
block 226. The variable X2 is therefore initially added to the
live set of basic block 226, but then subsequently removed
since block 226 also contains a definition of the variable X2
in the expression of the phi function X2=φ(X1, X5).

In considering basic block 238, the child block 226 of
block 238 has been shown to still have an empty live
variable set. Thus the live variable set of parent block 238
remains unchanged. The live variable set of basic block 222,
however, does change in this second iteration of the process.
In the first iteration, block 222 had an empty live variable
set. Addition of the live set for basic block 238, the child of
basic block 222, adds the variable X5 to the live set of block
222.

Consideration of the live variable set for basic block 228
is also helpful in illustrating the backwards dataflow process
of the invention. In the first iteration, the only variables used
in block 228, variables X3 and X4, were arguments of a phi
function, and are therefore ignored. Since the variable X5
was only "defined" in block 228, it also was not added to the
live variable set of block 228 in the first iteration of the
process. Adding the live variable set of basic block 222, the
child of block 228, in the second iteration of the process the
variable X5 is now first added to the live variable set of block
228, but then removed from the live set since variables that
are initially added to the live set because of a "use" are then
deleted if subsequently "defined" within the basic block.
Since the dataflow analysis is performed backwards, the
definition of the variable X5 in basic block 228 is subsequent
to the addition of the variable X5 provided by adding the live
variable set of the child block 222. Continuing, the live
variable set of basic block 218 remains unchanged and an
empty set in the second iteration of the process. Although
variable X3 in the live set of basic block 234 is first added
to the live set of block 218, it is then removed because of the
subsequent definition of the variable X3 in block 218. The
variable X4 is similarly first added to the live set of basic
block 219 (since it is in the live set of block 233, the child
of block 219) but then removed from the live set because of
the subsequent definition of the variable X4 in block 219.

Comparing the live variable sets of the various basic
blocks shown in FIGS. 10A and 10B, it can be seen that
changes in the live variable sets occur in both basic blocks
216 and 222. As noted above the variable X5 was added to
the live set of block 222 when the variables in the live set of
block 238, the child of block 222, were considered in the
analysis of block 222. Similarly, the first iteration of the
process yielded an empty set for block 216 since no
variables are used in block 216. In the second iteration, however,
the variables in the live sets of the child blocks of block 216
(blocks 218 and 219) were considered. Although block 218
has an empty live variable set, the live variable set of block
219 includes the variable X2, which is therefore added to the
live variable set of block 216.

In accordance with yet another aspect of the invention, the
step of determining the live variable sets for each of the
basic blocks in the control flow graph is iteratively repeated
until the results remain unchanged. In each of the later
iterations, the union of the live variable sets for the child
blocks is again considered in the determination of the live
set for the parent block. In the example presented above the
live variable sets for blocks 216 and 222 were altered
between the first and second iterations. Thus the process is
repeated in a third iteration. In FIG. 10C there are shown the
live sets for each of basic blocks 212-238 in control flow
graph 230 after performing a third iteration of the backward
dataflow process presented above. By comparing FIGS. 10B
and 10C, it can be seen that none of the live variable sets
have changed between the second and third iterations for
any of the basic blocks 212-238. The process of determining
the live variable set is therefore completed.
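
The iteration to a fixed point can be sketched in Python as
follows. This is a hedged illustration rather than the
implementation of the invention: the encoding of control flow
graph 230, the function name live_variable_sets, and the
visiting order are assumptions, and the use and definition
tables are reconstructed from the description of FIGS. 9 and
10A above.

```python
# Hypothetical encoding of control flow graph 230 (FIG. 9): block -> children.
succ = {
    212: [236], 236: [226], 226: [214], 214: [216, 224],
    216: [218, 219], 218: [234], 234: [228], 219: [233], 233: [228],
    228: [222], 222: [238], 238: [226], 224: [],
}
# Variables used in a block before any definition of them in that block.
# Phi arguments are not listed, since they do not count as uses.
uses = {212: {"Y"}, 214: {"X2"}, 219: {"X2"}, 224: {"X2"},
        233: {"X4"}, 234: {"X3"}, 236: {"X1"}, 238: {"X5"}}
# Variables defined in a block, including phi results X2 and X5.
defs = {212: {"X1"}, 218: {"X3"}, 219: {"X4"}, 226: {"X2"}, 228: {"X5"}}
# Assumed visiting order: last block 224 first, first block 212 last.
order = [224, 238, 222, 228, 233, 219, 234, 218, 216, 214, 226, 236, 212]

def live_variable_sets(succ, uses, defs, order):
    """Repeat backward passes until no live set changes."""
    # Iteration 1: only the uses inside each block are considered.
    live = {b: set(uses.get(b, set())) for b in succ}
    iterations = 1
    changed = True
    while changed:
        changed = False
        iterations += 1
        for b in order:
            # Union of the live sets of the children of b.
            from_children = set().union(*(live[c] for c in succ[b]))
            # Variables defined in b are removed; b's own uses are kept.
            new = uses.get(b, set()) | (from_children - defs.get(b, set()))
            if new != live[b]:
                live[b] = new
                changed = True
    return live, iterations

live, iterations = live_variable_sets(succ, uses, defs, order)
print("iterations:", iterations)
for block in sorted(live):
    print(block, sorted(live[block]))
```

Under these assumptions only blocks 216 and 222 change in the
second pass, and the third pass confirms that nothing further
changes, which agrees with the comparison of FIGS. 10B and 10C
described above.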

With the determination of the live variable set for the 
basic block in the program, a variety of register allocation 
and variable prioritization techniques may be employed to 
avoid spill-over. As noted above spill-over occurs when the 
number of variables that must be manipulated during the 
execution of the program exceeds the number of registers in 
the CPU. 
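
The spill-over condition just described can be checked with a
short sketch. The Python below is illustrative only: the function
name spill_overflow, the block names, and the live sets shown
are hypothetical and do not come from the figures.

```python
def spill_overflow(live_sets, num_registers):
    """Map each block whose live set exceeds the register count to the excess."""
    return {
        block: len(variables) - num_registers
        for block, variables in live_sets.items()
        if len(variables) > num_registers
    }

# Hypothetical live sets for three blocks and a machine with two registers.
example_live_sets = {"B1": {"a", "b", "c"}, "B2": {"a"}, "B3": {"b", "c"}}
overflow = spill_overflow(example_live_sets, num_registers=2)
print(overflow)
```

A block appears in the result only when at least one of its live
variables cannot be held in a register, so smaller and more
accurate live sets directly reduce the number of such blocks.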

As an alternative technique for register allocation, execu- 
tion of the program may be divided so that a program portion 
having a live variable set of fewer members than the number 
of CPU registers may first be executed and then a realloca- 
tion of CPU registers may subsequently be performed before 
the next portion of the program is executed. The present
invention thus provides a more efficient process for avoiding
spill-over and for speeding up the execution of target
computer programs. The execution of the variable prioritization
and register allocation processes, such as those involving
graph coloring, may be performed by the computer
illustrated in FIG. 1. In this instance the live variable sets
determined in accordance with the process of the invention 
may be stored in memory 4. 

It will be appreciated by those skilled in the art that 
various modifications and alterations may be made in the 
preferred embodiments of the invention disclosed herein 
without departing from the scope of this invention. 
Accordingly, the scope of the invention is not to be limited 
to the particular invention embodiments discussed above, 
but should be defined only by the claims set forth below and 
equivalents thereof. 
What is claimed is: 

1. A computer system having a central processing unit 
(CPU) and random access memory (RAM) coupled to said 
CPU, for use in compiling a target program to run on a target 
computer architecture having a fixed number of CPU 
registers, said target program having at least one basic block 
and wherein "use" blocks are inserted in a control flow graph 
which inhibit needless variable spilling to memory, said 
computer system comprising: 
a compiler system resident in said computer system 
having a front end compiler, a code optimizer and a 
back end code generator; and
said code optimizer configured to determine a set of live
variables for said basic block in said target program,
wherein said fixed number of CPU registers in said
target computer architecture can be allocated to said set
of live variables in a manner which minimizes a
number of variables in said set of live variables that
must be stored in said memory instead of in said fixed
number of CPU registers,

wherein said code optimizer inserts, in at least a portion
of a control flow graph, a dummy block representing
usage of a variable in said control flow graph prior to
a phi-function node and determines said set of live
variables by considering said dummy block, and
wherein said code optimizer is further configured to
perform a static single assignment transform on said
target program and to add said phi function node to said
control flow graph representation of said target
program, and wherein said dummy block is a use block.

2. The computer system of claim 1 wherein said code
optimizer is further configured to determine said set of live
variables by using a backward dataflow analysis in which
said variable is an argument in said phi function node and
said variable is not considered as a use within said phi
function node.

3. The computer system of claim 2 wherein said backward
dataflow analysis is performed by beginning at an end of
said basic block and adding to said set of live variables those
variables live at a beginning of one or more children blocks
of said basic block and adding to said set of live variables
each usage instance of said variable and deleting from said
set of live variables each subsequent definition of said
variable.

4. The computer system of claim 3 wherein said backward
dataflow analysis is iteratively performed until said set of
live variables for said basic block remains constant between
successive iterations.

5. The computer system of claim 1 wherein said phi
function is added to said control flow graph representation of
said target program by said code optimizer whenever
multiple definitions of said variable reach a use of that variable.

6. The computer system of claim 5 wherein said dummy
block is inserted into said control flow graph between said
phi function having said variable in its argument and a
preceding block containing a definition of said variable.

7. A compiler system for compiling a target program to
run on a target computer architecture having a memory and
a plurality of CPU registers, said compiler system
comprising:

a front end portion configured to accept source code of
said target program as input and to output a
corresponding intermediate code set;

a code optimizer portion coupled to said front end portion
and configured to accept said intermediate code set as
input and to output a second intermediate code set,

wherein said second intermediate code set comprises code
for said target program that allocates said plurality of
CPU registers of said target computer architecture to a
set of live variables for basic blocks in said target
program in a manner which minimizes a number of
variables in said set of live variables that must be stored
in said memory instead of in said plurality of CPU
registers,

wherein said code optimizer inserts, in at least a portion
of a control flow graph, a dummy block representing
usage of a variable in said control flow graph prior to
a phi-function node and determines said set of live
variables by considering said dummy block; wherein
said code optimizer is further configured to perform a
static single assignment transform on said target
program, and to add said phi function node to said
control flow graph representation of said target
program and wherein said dummy block is a use block; and

a back end code generator portion coupled to said code
optimizer configured to accept said second
intermediate code set as input and to generate binary code which
will run on said target computer architecture.

8. The compiler system of claim 7 wherein said code
optimizer is further configured to determine said set of live
variables by using a backward dataflow analysis in which
said variable is an argument in said phi function node and
said variable is not considered as a use within said phi
function node.

9. The compiler system of claim 8 wherein said backward
dataflow analysis is performed by beginning at an end of
said basic block and adding to said set of live variables those
variables live at a beginning of one or more children blocks
of said basic block and adding to said set of live variables
each usage instance of said variable and deleting from said
set of live variables each subsequent definition of said
variable.

10. The compiler system of claim 9 wherein said
backward dataflow analysis is to be iteratively performed until
said set of live variables for said basic block remains
constant between successive iterations.

11. The computer system of claim 7 wherein said phi
function is added to said control flow graph representation of
said target program by said code optimizer whenever
multiple definitions of said variable reach a use of that variable.

12. The computer system of claim 11 wherein said
dummy block is inserted into said control flow graph
between said phi function having said variable in its
argument and a preceding block containing a definition of said
variable.

13. A code optimizer for use in a compiler system for
compiling a target program to run on a target computer
architecture having a memory and a plurality of CPU
registers, said code optimizer comprising:
a first portion configured to accept as input an interme- 
diate code representation of said target program; 
a second portion, coupled to said first portion, configured 
to determine a set of live variables for at least one basic 
block of said target program; and 
a third portion, coupled to said second portion, configured 
to allocate said plurality of CPU registers of said target 
computer architecture to said set of live variables for 
said basic blocks of said target program in a manner 
that minimizes a number of variables in said set of live 
variables that must be stored in said memory instead of 
in said plurality of CPU registers, 
wherein said third portion inserts, in at least a portion of 
a control flow graph, a dummy block representing 
usage of a variable in said control flow graph prior to 
a phi-function node and determines said set of live 
variables by considering said dummy block, wherein 
the third portion is further configured to perform a 
static single assignment transform on said target pro- 
gram and to add said phi function node to said control 
flow graph representation of said target program, and 
wherein said dummy block is a use block. 

14. The code optimizer of claim 13 wherein said third
portion is further configured to determine said set of live
variables by using a backward dataflow analysis in which
said variable is an argument in said phi function node and
said variable is not considered as a use within said phi
function node.

15. The code optimizer of claim 14 wherein said back- 
ward dataflow analysis is performed by beginning at an end 
of said basic block and adding to said set of live variables 
those variables live at a beginning of one or more children 
blocks of said basic block and adding to said set of live 
variables each usage instance of said variable and deleting 
from said set of live variables each subsequent definition of 
said variable. 

16. The code optimizer of claim 15 wherein said
backward dataflow analysis is iteratively performed until said set
of live variables for said basic block remains constant
between successive iterations.

17. The computer system of claim 13 wherein said phi 
function is added to said control flow graph representation of 
said target program by said third portion whenever multiple 
definitions of said variable reach a use of that variable. 

18. The computer system of claim 17 wherein said
dummy block is inserted into said control flow graph 
between said phi function having said variable in its argu- 
ment and a preceding block containing a definition of said 
variable. 

19. A computer controlled method of allocating a live set
of variables in a target program to a fixed number of CPU
registers in a target computer architecture in a manner that
minimizes a number of variables in said live set of variables
that must be stored in a memory instead of in said fixed
number of CPU registers; said target program having at least
one basic block, said method comprising steps of:

a) constructing an interference graph representing a pro- 
cedure of said target program; 

b) determining said live set of variables for said target 
program by sub-steps of: 

b1) performing a static single assignment transform of
said procedure of said target program, including a
step of adding a phi function to a control flow graph
of said target program;

b2) inserting a use block representative of a use of a 
variable in said control flow graph between said phi 
function and a block containing a definition of said 
variable; and 

b3) determining said live set of variables for said basic
block in said target program by a backward dataflow
analysis of said control flow graph without including
arguments of said phi function wherein said
backward dataflow analysis is performed beginning at an
end of said basic block, initially adding to said live
set of variables those variables live at a beginning of
one or more children blocks of said basic block and
adding to said live set of variables each instance of
a use of a variable and deleting from said live set of
variables each subsequent definition of said variable;
and

c) mapping said live set of variables to said fixed number
of CPU registers.

20. The method of claim 19 wherein the backward
dataflow analysis is iteratively performed until said live set of
variables remains constant between successive iterations.

21. The method of claim 19 wherein said phi function is 
added to said control flow graph where multiple definitions 
of said variable reach a use of that variable. 

22. A computer controlled method of optimizing binary 
code of a target program which is compiled to run on a target 
computer architecture having a memory and a fixed number 
of CPU registers, said method comprising steps of: 

a) providing a compiler system configured to accept 
source code of said target program and to output binary 
code representing said target program which is capable 
of being processed on said target computer 
architecture, said compiler system comprising a front 
end portion, a code optimizer portion and a back end 
code generator; 

b) providing said code optimizer portion of said compiler 
system configured to accept intermediate code from 
said front end portion of said compiler system and to 
allocate a live set of variables in said intermediate code 
representing said target program to said fixed number 
of CPU registers in said target computer architecture in 
a manner that minimizes a number of variables in said 
live set of variables that must be stored in said memory 
instead of in said fixed number of CPU registers, 

wherein allocation of said live set of variables in said
intermediate code representing said target program to
said fixed number of CPU registers is performed by
sub-steps:

(b1) constructing an interference graph representing a
procedure of said target program; determining said
live set of variables for said target program by
sub-steps of:

(b1a) performing a static single assignment
transform of said procedure of said target program,
including adding a phi function to a control flow
graph of said target program;
(b1b) inserting a use block representing usage of a
variable in said control flow graph between said
phi function and a defining block containing a
definition of said variable; and
(b1c) determining said live set of variables for basic
blocks in said target program by a backward
dataflow analysis of said control flow graph;
(b2) mapping said live set of variables to said fixed
number of CPU registers; and
(c) outputting a second intermediate code version of said 
target program to said back end code generator, said 
second intermediate code version of said target pro- 
gram containing the allocation of said live set of 
variables in said target program to said fixed number of 
CPU registers in said target computer architecture in a 
manner that minimizes a number of variables in said 
live set of variables that must be stored in said memory
instead of in said fixed number of CPU registers,
whereby said target program binary code is optimized.